Sequences, segments, and signals
The post Sequences, streams, and segments offered an answer to the the question of what’s missing in the following box:
infinite | finite | |
discrete | Stream | Sequence |
continuous | Function | ??? |
I presented a simple type of function segments, whose representation contains a length (duration) and a function.
This type implements most of the usual classes: Monoid
, Functor
, Zip
, and Applicative
, as well Comonad
, but not Monad
.
It also implements a new type class, Segment
, which generalizes the list functions length
, take
, and drop
.
The function type is simple and useful in itself. I believe it can also serve as a semantic foundation for functional reactive programming (FRP), as I’ll explain in another post. However, the type has a serious performance problem that makes it impractical for some purposes, including as implementation of FRP.
Fortunately, we can solve the performance problem by adding a simple layer on top of function segments, to get what I’ll call “signals”. With this new layer, we have an efficient replacement for function segments that implements exactly the same interface with exactly the same semantics. Pleasantly, the class instances are defined fairly simply in terms of the corresponding instances on function segments.
You can download the code for this post.
Edits:
- 2008-12-06:
dup [] = []
near the end (was[mempty]
). - 2008-12-09: Fixed
take
anddrop
default definitions (thanks to sclv) and added point-free variant. - 2008-12-18: Fixed
appl
, thanks to sclv. - 2011-08-18: Eliminated accidental emoticon in the definition of
dup
, thanks to anonymous.
The problem with function segments
The type of function segments is defined as follows:
data t :-># a = FS t (t -> a)
The domain of the function segment is from zero up to but not including the given length.
An efficiency problem becomes apparent when we look at the Monoid
instance:
instance (Ord t, Num t) => Monoid (t :-># a) where
mempty = FS 0 (error "sampling empty 't :-># a'")
FS c f `mappend` FS d g =
FS (c + d) ( t -> if t <= c then f t else g (t - c))
Concatenation (mappend
) creates a new segment that chooses, for every domain value t
, whether to use one function or another.
If the second, then t
must be shifted backward, since the function is being shifted forward.
This implementation would be fine if we use mappend
just on simple segments.
Once we get started, however, we’ll want to concatenate lots & lots of segments.
In FRP, time-varying values go through many phases (segments) as time progresses.
Each quantity is described by a single “behavior” (sometimes called a “signal”) with many, and often infinitely many, phases.
Imagine an infinite tree of concatenations, which is typical for FRP behaviors.
At every moment, one phase is active.
Every sampling must recursively discover the active phase and the accumulated domain translation (from successive subtractions) to apply when sampling that phase.
Quite commonly, concatenation trees get progressively deeper on the right (larger t
values).
In that case, sampling will get slower and slower with time.
I like to refer to these progressive slow-downs as “time leaks”. There is also a serious space leak, since all of the durations and functions that go into a composed segment will be retained.
Sequences of segments
The problem above can be solved with a simple representation change. Instead of combining functions into functions, just keep a list of simple function segments.
-- | Signal indexed by t with values of type a.
newtype t :-> a = S { unS :: [t :-># a] }
I’ll restrict in function segments to be non-empty, to keep the rest of the implementation simple and efficient.
This new representation allows for efficient monotonic sampling of signals. As old segments are passed up, they can be dropped.
What does it mean?
There’s one central question for me in defining any data type: What does it mean?
The meaning I’ll take for signals is function segments. This interpretation is made precise in a function that maps from the type to the model (meaning). In this case, simply concatenate all of the function segments:
meaning :: (Ord t, Num t) => (t :-> a) -> (t :-># a)
meaning = mconcat . unS
Specifying the meaning of a type gives users a working model, and it defines correctness of implementation. It also tells me what class instances to implement and tells users what instances to expect. If a type’s meaning implements a class then I want the type to as well. Moreover, the type’s intances have to agree with the model’s instances. I’ve described this latter principle in Simplifying semantics with type class morphisms and some other posts.
Higher-order wrappers
To keep the code below short and clear, I’ll use some functions for adding and removing the newtype wrappers.
These higher-order function apply functions inside of (:->)
representations:
inS :: ([s :-># a] -> [t :-># b])
-> ((s :-> a) -> (t :-> b))
inS2 :: ([s :-># a] -> [t :-># b] -> [u :-># c])
-> ((s :-> a) -> (t :-> b) -> (u :-> c))
Using the trick described in Prettier functions for wrapping and wrapping, the definitions are simpler than the types:
inS = result S . argument unS
inS2 = result inS . argument unS
Functor
The Functor
instance applies a given function inside the function segments inside the lists:
instance Functor ((:->) t) where
fmap h (S ss) = S (fmap (fmap h) ss)
Or, in the style of Semantic editor combinators,
instance Functor ((:->) t) where
fmap = inS . fmap . fmap
Why this definition?
Because it is correct with respect to the semantic model, i.e., the meaning of fmap
is fmap
of the meaning, i.e.,
meaning . fmap h == fmap h . meaning
which is to say that the following diagram commutes:
Proof:
meaning . fmap h
== {- fmap definition -}
meaning . S . fmap (fmap h) . unS
== {- meaning definition -}
mconcat . unS . S . fmap (fmap h) . unS
== {- unS and S are inverses -}
mconcat . fmap (fmap h) . unS
== {- fmap h distributes over mappend -}
fmap h . mconcat . unS
== {- meaning definition -}
fmap . meaning
Applicative
and Zip
Again, the meaning
functions tells us what the Applicative
instance has to mean.
We only get to choose how to implement that meaning correctly.
The Applicative
morphism properties:
meaning (pure a) == pure a
meaning (bf <*> bx) == meaning bf <*> meaning bx
Our Applicative
instance has a definition similar in simplicity and style to the Functor
instance, assuming a worker function appl
for (<*>)
:
instance (Ord t, Num t, Bounded t) => Applicative ((:->) t) where
pure = S . pure . pure
(<*>) = inS2 appl
appl :: (Ord t, Num t, Bounded t) =>
[t :-># (a -> b)] -> [t :-># a] -> [t :-># b]
This worker function is somewhat intricate. At least my implementation of it is, and perhaps there’s a simpler one.
Again, the meaning
functions tells us what the Applicative
instance has to mean.
We only get to choose how to implement that meaning correctly.
First, if either segment list runs out, the combination runs out (because the same is true for the meaning of signals).
[] `appl` _ = []
_ `appl` [] = []
If neither segment list is empty, open up the first segment.
Split the longer segment into a prefix that matches the shorter segment, and combine the two segments with (<*>)
(on function segments).
Toss the left-over piece back in its list, and continue.
(fs:fss') `appl` (xs:xss')
| fd == xd = (fs <*> xs ) : (fss' `appl` xss' )
| fd < xd = (fs <*> xs') : (fss' `appl` (xs'':xss'))
| otherwise = (fs' <*> xs ) : ((fs'':fss') `appl` xss')
where
fd = length fs
xd = length xs
(fs',fs'') = splitAt xd fs
(xs',xs'') = splitAt fd xs
A Zip
instance is easy as always with applicative functors:
instance (Ord t, Num t, Bounded t) => Zip ((:->) t) where zip = liftA2 (,)
Monoid
The Monoid
instance:
instance Monoid (t :-> a) where
mempty = S []
S xss `mappend` S yss = S (xss ++ yss)
Correctness follows from properties of mconcat
(as used in the meaning
function).
We’re really just using the Monoid
instance for the underlying representation, i.e.,
instance Monoid (t :-> a) where
mempty = S mempty
mappend = inS2 mappend
Segment
The Segment
class has length
, take
and drop
.
It’s handy to include also null
and splitAt
, both modeled after their counterparts on lists.
The new & improved Segment
class:
class Segment seg dur | seg -> dur where
null :: seg -> Bool
length :: seg -> dur
take :: dur -> seg -> seg
drop :: dur -> seg -> seg
splitAt :: dur -> seg -> (seg,seg)
-- Defaults:
splitAt d s = (take d s, drop d s)
take d s = fst (splitAt d s)
drop d s = snd (splitAt d s)
If we wanted to require dur
to be numeric, we could add a default for null
as well.
This default could be quite expensive in some cases.
(In the style of Semantic editor combinators, take = (result.result) fst splitAt
, and similarly for drop
.)
The null
and length
definitions are simple, following from to properties of mconcat
.
instance (Ord t, Num t) => Segment (t :-> a) t where
null (S xss) = null xss
length (S xss) = sum (length <$> xss)
...
The null case says that the signal is empty exactly when there are no function segments. This simple definition relies on our restriction to non-empty function segments. If we drop that restriction, we’d have to check that every segment is empty:
-- Alternative definition
null (S xss) = all null xss
The length
is just the sum of the lengths.
The tricky piece is splitAt
(used to define both take
and drop), which must assemble segments to satisfy the requested prefix length.
The last segment used might have to get split into two, with one part going into the prefix and one to the suffix.
splitAt _ (S []) = (mempty,mempty)
splitAt d b | d <= 0 = (mempty, b)
splitAt d (S (xs:xss')) =
case (d `compare` xd) of
EQ -> (S [xs], S xss')
LT -> let (xs',xs'') = splitAt d xs in
(S [xs'], S (xs'':xss'))
GT -> let (S uss, suffix) = splitAt (d-xd) (S xss') in
(S (xs:uss), suffix)
where
xd = length xs
Copointed
and Comonad
To extract an element from a signal, extract an element from its first function segment. Awkwardly, extraction will fail (produce ⊥/error) when the signal is empty.
instance Num t => Copointed ((:->) t) where
extract (S []) = error "extract: empty S"
extract (S (xs:_)) = extract xs
I’ve exploited our restriction to non-empty function segments.
Otherwise, extract
would have to skip past the empty ones:
-- Alternative definition
instance Num t => Copointed ((:->) t) where
extract (S []) = error "extract: empty S"
extract (S (xs:xss'))
| null xs = extract (S xss')
| otherwise = extract xs
The error/⊥ in this definition is dicey, as is the one for function segments.
If we allow the same abuse in order to define a list Copointed
, we can get an alternative to the first definition that is prettier but gives a less helpful error message:
instance Num t => Copointed ((:->) t) where
extract = extract . extract . unS
See the closing remarks for more about this diciness.
Finally, we have Comonad
, with its duplicate
method.
duplicate :: (t :-> a) -> (t :-> (t :-> a))
I get confused with wrapping and unwrapping, so let’s separate the definition into a packaging part and a content part.
instance (Ord t, Num t) => Comonad ((:->) t) where
duplicate = fmap S . inS dup
with content part:
dup :: (Ord t, Num t) => [t :-># a] -> [t :-># [t :-># a]]
The helper function, dup
, takes each function segment and prepends each of its tails onto the remaining list of segments.
If the segment list is empty, then it has only one tail, also empty.
dup [] = []
dup (xs:xss') = ((: xss') <$> duplicate xs) : dup xss'
Closing remarks
The definitions above use the function segment type only through its type class interfaces, and so can all of them can be generalized. Several definitions rely on the
Segment
instance, but otherwise, each method for the composite type relies on the corresponding method for the underlying segment type. (For instance,fmap
usesfmap
,(<*>)
uses(<*>)
,Segment
usesSegment
, etc.) This generality lets us replace function segments with more efficient representations, e.g., doing constant propagation, as in Reactive. We can also generalize from lists in the definitions above.Even without concatenation, function segments can become expensive when
drop
is repeatedly applied, because function shifts accumulate (e.g.,f . (+ 0.01) . (+ 0.01) ....
). A moredrop
-friendly representation for function segments would be a function and an offset. Successive drops would add offsets, andextract
would always apply the function to its offset. This representation is similar to theFunArg
comonad, mentioned in The Essence of Dataflow Programming (Section 5.2).The list-of-segments representation enables efficient monotonic sampling, simply by dropping after each sample. A variation is to use a list zipper instead of a list. Then non-monotonic sampling will be efficient as long as successive samplings are for nearby domain values. Switching to multi-directional representation would lose the space efficiency of unidirectional representations. The latter work well with lazy evaluation, often running in constant space, because old values are recycled while new values are getting evaluated.
Still another variation is to use a binary tree instead of a list, to avoid the list append in the
Monoid
instance for(t :-> a)
. A tree zipper would allow non-monotonic sampling. Sufficient care in data structure design could perhaps yield efficient random access.There’s a tension between the
Copointed
andMonoid
interfaces.Copointed
hasextract
, andMonoid
hasmempty
, so what is the value ofextract mempty
? Given thatCopointed
is parameterized over arbitrary types, the only possible answer seems to be ⊥ (as in the use oferror
above). I don’t know if the comonad laws can all hold for possibly-empty function segments or for possibly-empty signals. I’m grateful to Edward Kmett for helping me understand this conflict. He suggested using two coordinated types, one possibly-empty (a monoid) and the other non-empty (a comonad). I’m curious to see whether that idea works out.
Matt Hellige:
As I describe here, I’d prefer to write:
To my eyes, it doesn’t get much better than that.
5 December 2008, 8:59 amconal:
@Matt: Beautiful notation!!
Maybe also
for things like
5 December 2008, 12:56 pmS <~ unS
, when someone is in a left-to-right mood.newsham:
I think most programmers (myself included) would have gone straight to this efficient implementation when writing the function. I really like the approach you’ve taken and I see it as a strength of your semantic design approach. You started out with an implementation that was more principled and closer to the semantics you were going after and then refined it using the first implementation as a model to guide the efficient implementation. This gave you more time to work with the underlying ideas and gain insights into it (ie. taking longer is a benefit here). Then when you wrote the efficient implementation, you checked your work, something most programmers (myself included) wouldn’t have done if they jumped right into the efficient implementation. Further, the first model provided a means to check the work! The traditional approach would have a programmer do the work in his head and never make the reasoning explicit and probably leave this crucial part of the design undocumented.
7 December 2008, 10:24 amsclv:
Great series! This stuff is very useful. The default definitions for drop and take are backwards, by the way.
9 December 2008, 10:50 amconal:
@sclv: Thanks for the encouragement! And for catching my drop/take default swap. Fixed now in the post and code.
9 December 2008, 12:08 pmsclv:
Here are some functions I’ve found useful so far in working with signals. You might be able to clean them up and generalize them further. They work for my purposes, but feel a little overconstrained in their types at the moment.
12 December 2008, 8:24 amConal Elliott » Blog Archive » Trimming inputs in functional reactive programming:
[…] If we switch from absolute time to relative time, then trimming becomes something with familiar semantics, namely drop, as generalized and used in two of my previous posts, Sequences, streams, and segments and Sequences, segments, and signals. […]
13 December 2008, 1:27 pmsclv:
another bug. in the applicative instance, you have:
The fd and the xd are reversed.
18 December 2008, 6:12 amconal:
@sclv,
Yes indeed. Many thanks for catching these bugs. Fixed now in the text.
18 December 2008, 8:50 pm